AN4031
Application note

Using the STM32F2 and STM32F4 DMA controller

February 2014, DocID022648 Rev 1

Introduction

This application note describes how to use the STM32F2xx and STM32F4xx direct memory access (DMA) controller. The STM32F2xx/F4xx DMA controller features, the system architecture, the multi-layer bus matrix and the memory system together provide high data bandwidth and make it possible to develop software with very low latency response times.

This application note also gives tips and tricks that allow developers to take full advantage of these features and ensure correct response times for different peripherals and subsystems.

STM32F2xx and STM32F4xx are referred to as "STM32F2/F4 devices" and the DMA controller as "DMA" throughout the document.

This application note applies to the products listed in Table 1. It should be read in conjunction with the STM32F2/F4 reference manuals (RM0033, RM0090 and RM0368).

Table 1. Applicable products

Type             | Part numbers
-----------------|-------------
Microcontrollers | STM32F2xx (STM32F205, STM32F207, STM32F215, STM32F217)
Microcontrollers | STM32F4xx (STM32F401, STM32F405, STM32F407, STM32F415, STM32F417, STM32F427, STM32F429, STM32F437, STM32F439)

www.st.com

Contents

1 DMA controller description
  1.1 DMA transfer properties
    1.1.1 DMA streams/channels
    1.1.2 Stream priority
    1.1.3 Source and destination addresses
    1.1.4 Transfer mode
    1.1.5 Transfer size
    1.1.6 Incrementing source/destination address
    1.1.7 Source and destination data width
    1.1.8 Transfer types
    1.1.9 DMA FIFO mode
    1.1.10 Source and destination burst size
    1.1.11 Double-buffer mode
    1.1.12 Flow control
  1.2 Setting up a DMA transfer
2 System performance considerations
  2.1 Multi-layer bus matrix
    2.1.1 Definitions
    2.1.2 Round-robin priority scheme
    2.1.3 BusMatrix arbitration and DMA transfer delays worst case
  2.2 DMA transfer paths
    2.2.1 Dual DMA port
    2.2.2 DMA transfer states
    2.2.3 DMA request arbitration
  2.3 AHB-to-APB bridge
    2.3.1 Dual AHB-to-APB port
    2.3.2 AHB-to-APB bridge arbitration
3 How to predict DMA latencies
  3.1 DMA transfer time
    3.1.1 Default DMA transfer timing
    3.1.2 DMA transfer time versus concurrent access
  3.2 Examples
    3.2.1 ADC-to-SRAM DMA transfer
    3.2.2 SPI full duplex DMA transfer
4 Tips and warnings while programming the DMA controller
5 Conclusion
6 Revision history

List of tables

Table 1. Applicable products
Table 2. DMA1 request mapping
Table 3. DMA2 request mapping
Table 4. DMA1 request mapping for STM32F401
Table 5. DMA2 request mapping for STM32F401
Table 6. Possible burst configurations
Table 7. Peripheral port access/transfer time versus DMA path used
Table 8. Memory port access/transfer time
Table 9. DMA peripheral (ADC) port transfer latency
Table 10. DMA memory (SRAM) port transfer latency
Table 11. Document revision history

List of figures

Figure 1. DMA block diagram
Figure 2. Channel selection
Figure 3. DMA source address and destination address incrementing
Figure 4. FIFO structure
Figure 5. DMA burst transfer
Figure 6. Double-buffer mode
Figure 7. System architecture
Figure 8. CPU and DMA1 request an access to SRAM1
Figure 9. Five masters request SRAM access
Figure 10. DMA transfer delay due to CPU transfer issued by interrupt
Figure 11. DMA dual port
Figure 12. Peripheral-to-memory transfer states
Figure 13. Memory-to-peripheral transfer states
Figure 14. DMA request arbitration
Figure 15. AHB-to-APB1 bridge concurrent CPU and DMA1 access request
Figure 16. SPI full duplex DMA transfer time

1 DMA controller description

The DMA is an AMBA advanced high-performance bus (AHB) module that features three AHB ports: a slave port for DMA programming and two master ports (peripheral and memory ports) that allow the DMA to initiate data transfers between different slave modules.

The DMA allows data transfers to take place in the background, without the intervention of the Cortex-Mx processor. During this operation, the main processor can execute other tasks and is only interrupted when a whole data block is available for processing. Large amounts of data can be transferred with no major impact on system performance.

The DMA is mainly used to implement central data buffer storage (usually in the system SRAM) for different peripheral modules. This solution is less expensive in terms of silicon and power consumption than a distributed solution where each peripheral needs to implement its own local data storage.

The STM32F2/F4 DMA controller takes full advantage of the Cortex-Mx Harvard architecture and the multi-layer bus system in order to ensure very low latency both for DMA transfers and for CPU execution and interrupt event detection/service.

1.1 DMA transfer properties

A DMA transfer is characterized by the following properties:
• DMA stream/channel
• Stream priority
• Source and destination addresses
• Transfer mode
• Transfer size (only when the DMA is the flow controller)
• Source/destination address incrementing or non-incrementing
• Source and destination data width
• Transfer type
• FIFO mode
• Source/destination burst size
• Double-buffer mode
• Flow control

STM32F2/F4 devices embed two DMA controllers, and each DMA has two ports, one peripheral port and one memory port, which can work simultaneously.

Figure 1 shows the DMA block diagram.

Figure 1. DMA block diagram
[Figure: the DMA controller contains an AHB slave programming interface (programming port), a channel selection stage in which requests REQ_STRx_CH0 to REQ_STRx_CH7 feed REQ_STREAMx for each of streams 0 to 7, an arbiter, one FIFO per stream, and two AHB masters: the memory port and the peripheral port.]

The following subsections provide a detailed description of each DMA transfer property.

1.1.1 DMA streams/channels

STM32F2/F4 devices embed two DMA controllers, offering up to 16 streams in total (eight per controller), each dedicated to managing memory access requests from one or more peripherals.

Each stream has up to eight selectable channels (requests) in total. This selection is software-configurable and allows several peripherals to initiate DMA requests.

Figure 2 describes the channel selection for a dedicated stream.
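
As an illustration of the software-configurable channel selection, the following minimal sketch composes and decodes the CHSEL[2:0] field of a DMA_SxCR value. The bit position (bits 27:25) is an assumption taken from the reference manuals rather than from this note, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field position: CHSEL[2:0] occupies bits 27:25 of DMA_SxCR.
 * Check this against the reference manual of the target device. */
#define CHSEL_SHIFT 25u
#define CHSEL_MASK  (0x7u << CHSEL_SHIFT)

/* Return a copy of a DMA_SxCR value with the channel field replaced.
 * 'channel' must be 0..7; a stream selects exactly one channel. */
uint32_t sxcr_select_channel(uint32_t sxcr, unsigned channel)
{
    assert(channel <= 7u);
    return (sxcr & ~CHSEL_MASK) | ((uint32_t)channel << CHSEL_SHIFT);
}

/* Read back which channel a DMA_SxCR value selects. */
unsigned sxcr_selected_channel(uint32_t sxcr)
{
    return (unsigned)((sxcr & CHSEL_MASK) >> CHSEL_SHIFT);
}
```

Because the field holds a single value, selecting a new channel for a stream necessarily replaces the previous one, which matches the rule that only one request can be active in a stream at a time.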

Figure 2. Channel selection
[Figure: the CHSEL[2:0] field of the DMA_SxCR register selects one of the eight requests REQ_STRx_CH0 to REQ_STRx_CH7 as REQ_STREAMx.]

Note: Only one channel/request can be active at the same time in a stream. More than one enabled DMA stream must not serve the same peripheral request.

Table 2 and Table 3 show the possible configurations of DMA streams/channels versus peripheral requests for all the supported products except STM32F401, which is described in Table 4 and Table 5.

Table 2. DMA1 request mapping

Peripheral requests | Stream 0 | Stream 1 | Stream 2 | Stream 3 | Stream 4 | Stream 5 | Stream 6 | Stream 7
--------------------|----------|----------|----------|----------|----------|----------|----------|---------
Channel 0 | SPI3_RX | - | SPI3_RX | SPI2_RX | SPI2_TX | SPI3_TX | - | SPI3_TX
Channel 1 | I2C1_RX | - | TIM7_UP | - | TIM7_UP | I2C1_RX | I2C1_TX | I2C1_TX
Channel 2 | TIM4_CH1 | - | I2S3_EXT_RX | TIM4_CH2 | I2S2_EXT_TX | I2S3_EXT_TX | TIM4_UP | TIM4_CH3
Channel 3 | I2S3_EXT_RX | TIM2_UP, TIM2_CH3 | I2C3_RX | I2S2_EXT_RX | I2C3_TX | TIM2_CH1 | TIM2_CH2, TIM2_CH4 | TIM2_UP, TIM2_CH4
Channel 4 | UART5_RX | USART3_RX | UART4_RX | USART3_TX | UART4_TX | USART2_RX | USART2_TX | UART5_TX
Channel 5 | UART8_TX (1) | UART7_TX (1) | TIM3_CH4, TIM3_UP | UART7_RX (1) | TIM3_CH1, TIM3_TRIG | TIM3_CH2 | UART8_RX (1) | TIM3_CH3
Channel 6 | TIM5_CH3, TIM5_UP | TIM5_CH4, TIM5_TRIG | TIM5_CH1 | TIM5_CH4, TIM5_TRIG | TIM5_CH2 | - | TIM5_UP | -
Channel 7 | - | TIM6_UP | I2C2_RX | I2C2_RX | USART3_TX | DAC1 | DAC2 | I2C2_TX

1. These requests are available on STM32F42xx and STM32F43xx only.

Table 3. DMA2 request mapping

Peripheral requests | Stream 0 | Stream 1 | Stream 2 | Stream 3 | Stream 4 | Stream 5 | Stream 6 | Stream 7
--------------------|----------|----------|----------|----------|----------|----------|----------|---------
Channel 0 | ADC1 | SAI1_A (1) | TIM8_CH1, TIM8_CH2, TIM8_CH3 | SAI1_A (1) | ADC1 | SAI1_B (1) | TIM1_CH1, TIM1_CH2, TIM1_CH3 | -
Channel 1 | - | DCMI | ADC2 | ADC2 | SAI1_B (1) | SPI6_TX (1) | SPI6_RX (1) | DCMI
Channel 2 | ADC3 | ADC3 | - | SPI5_RX (1) | SPI5_TX (1) | CRYP_OUT | CRYP_IN | HASH_IN
Channel 3 | SPI1_RX | - | SPI1_RX | SPI1_TX | - | SPI1_TX | - | -
Channel 4 | SPI4_RX (1) | SPI4_TX (1) | USART1_RX | SDIO | - | USART1_RX | SDIO | USART1_TX
Channel 5 | - | USART6_RX | USART6_RX | SPI4_RX (1) | SPI4_TX (1) | - | USART6_TX | USART6_TX
Channel 6 | TIM1_TRIG | TIM1_CH1 | TIM1_CH2 | TIM1_CH1 | TIM1_CH4, TIM1_TRIG, TIM1_COM | TIM1_UP | TIM1_CH3 | -
Channel 7 | - | TIM8_UP | TIM8_CH1 | TIM8_CH2 | TIM8_CH3 | SPI5_RX (1) | SPI5_TX (1) | TIM8_CH4, TIM8_TRIG, TIM8_COM

1. These requests are available on STM32F42xx and STM32F43xx only.

Table 4 and Table 5 show the possible configurations of DMA streams/channels versus peripheral requests for STM32F401 products.

Table 4. DMA1 request mapping for STM32F401

Peripheral requests | Stream 0 | Stream 1 | Stream 2 | Stream 3 | Stream 4 | Stream 5 | Stream 6 | Stream 7
--------------------|----------|----------|----------|----------|----------|----------|----------|---------
Channel 0 | SPI3_RX | - | SPI3_RX | SPI2_RX | SPI2_TX | SPI3_TX | - | SPI3_TX
Channel 1 | I2C1_RX | I2C3_RX | - | - | - | I2C1_RX | I2C1_TX | I2C1_TX
Channel 2 | TIM4_CH1 | - | I2S3_EXT_RX | TIM4_CH2 | I2S2_EXT_TX | I2S3_EXT_TX | TIM4_UP | TIM4_CH3
Channel 3 | I2S3_EXT_RX | TIM2_UP, TIM2_CH3 | I2C3_RX | I2S2_EXT_RX | I2C3_TX | TIM2_CH1 | TIM2_CH2, TIM2_CH4 | TIM2_UP, TIM2_CH4
Channel 4 | - | - | - | - | - | USART2_RX | USART2_TX | -
Channel 5 | - | - | TIM3_CH4, TIM3_UP | - | TIM3_CH1, TIM3_TRIG | TIM3_CH2 | - | TIM3_CH3
Channel 6 | TIM5_CH3, TIM5_UP | TIM5_CH4, TIM5_TRIG | TIM5_CH1 | TIM5_CH4, TIM5_TRIG | TIM5_CH2 | I2C3_TX | TIM5_UP | -
Channel 7 | - | - | I2C2_RX | I2C2_RX | - | - | - | I2C2_TX

Table 5. DMA2 request mapping for STM32F401

Peripheral requests | Stream 0 | Stream 1 | Stream 2 | Stream 3 | Stream 4 | Stream 5 | Stream 6 | Stream 7
--------------------|----------|----------|----------|----------|----------|----------|----------|---------
Channel 0 | ADC1 | - | - | - | ADC1 | - | TIM1_CH1, TIM1_CH2, TIM1_CH3 | -
Channel 1 | - | - | - | - | - | - | - | -
Channel 2 | - | - | - | - | - | - | - | -
Channel 3 | SPI1_RX | - | SPI1_RX | SPI1_TX | - | SPI1_TX | - | -
Channel 4 | SPI4_RX | SPI4_TX | USART1_RX | SDIO | - | USART1_RX | SDIO | USART1_TX
Channel 5 | - | USART6_RX | USART6_RX | SPI4_RX | SPI4_TX | - | USART6_TX | USART6_TX
Channel 6 | TIM1_TRIG | TIM1_CH1 | TIM1_CH2 | TIM1_CH1 | TIM1_CH4, TIM1_TRIG, TIM1_COM | TIM1_UP | TIM1_CH3 | -
Channel 7 | - | - | - | - | - | - | - | -

The STM32F2/F4 DMA request mapping is designed so that the software application has more flexibility to map each DMA request to the associated peripheral request, and so that most application use cases are covered by multiplexing the corresponding DMA streams and channels.

1.1.2 Stream priority

Each DMA port has an arbiter that handles the priority between DMA streams. Stream priority is software-configurable (there are four software levels). If two or more DMA streams have the same software priority level, the hardware priority is used (stream 0 has priority over stream 1, etc.).

1.1.3 Source and destination addresses

A DMA transfer is defined by a source address and a destination address. Both the source and the destination should be in the AHB or APB memory ranges and should be aligned to the transfer size.

1.1.4 Transfer mode

The DMA is capable of performing three different transfer modes:
• Peripheral to memory
• Memory to peripheral
• Memory to memory (only DMA2 is able to perform such transfers; in this mode, the circular and direct modes are not allowed)
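
The stream priority rule of section 1.1.2 (software level first, stream number as tie-breaker) can be sketched as a small host-side model. This is only an illustration: the type and function names are hypothetical, and the convention that a larger number means a higher software priority is an assumption.

```c
#include <assert.h>

#define DMA_STREAMS 8

/* One entry per stream: whether a request is pending, and its software
 * priority (assumed here: 0 = lowest ... 3 = highest). */
typedef struct {
    int pending;
    unsigned sw_priority;
} stream_req_t;

/* Return the stream number that wins arbitration, or -1 if no request
 * is pending. A higher software priority wins; if several streams share
 * the same level, the lowest stream number wins (hardware priority). */
int dma_arbitrate(const stream_req_t req[DMA_STREAMS])
{
    int winner = -1;
    for (int s = 0; s < DMA_STREAMS; ++s) {
        if (!req[s].pending)
            continue;
        /* Strict '>' keeps the lower-numbered stream on a tie. */
        if (winner < 0 || req[s].sw_priority > req[winner].sw_priority)
            winner = s;
    }
    return winner;
}
```

For example, if streams 2 and 5 are both pending at the same software level, the model grants stream 2; raising the software level of stream 5 reverses the outcome.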

1.1.5 Transfer size

The transfer size value has to be defined only when the DMA is the flow controller. This value defines the volume of data to be transferred from source to destination.

The transfer size is defined by the DMA_SxNDTR register value and by the peripheral-side data width. Depending on the received request (burst or single), the transfer size value is decreased by the amount of transferred data.

1.1.6 Incrementing source/destination address

It is possible to configure the DMA to automatically increment the source and/or destination address after each data transfer.

Figure 3. DMA source address and destination address incrementing
[Figure: a DMA data transfer between a source address and a destination address, shown once with the source incrementing and once with the destination incrementing.]

1.1.7 Source and destination data width

The data width for source and destination can be defined as:
• Byte (8 bits)
• Half-word (16 bits)
• Word (32 bits)

1.1.8 Transfer types

• Circular mode: the circular mode is available to handle circular buffers and continuous data flows (the DMA_SxNDTR register is then reloaded automatically with the previously programmed value).
• Normal mode: once the DMA_SxNDTR register reaches zero, the stream is disabled (the EN bit in the DMA_SxCR register is then equal to 0).

1.1.9 DMA FIFO mode

Each stream has an independent 4-word (4 * 32 bits) FIFO, and the threshold level is software-configurable between 1/4, 1/2, 3/4 or full. The FIFO is used to temporarily store data coming from the source before transmitting them to the destination.

The DMA FIFO can be enabled or disabled by software; when disabled, the direct mode is used. If the DMA FIFO is enabled, data packing/unpacking and/or burst mode can be used. The configured DMA FIFO threshold defines the DMA memory port request time.

The DMA FIFOs implemented on STM32F2/F4 devices help to:
• reduce SRAM access and so give more time for the other masters to access the bus matrix without additional concurrency,
• allow software to do burst transactions, which optimize the transfer bandwidth,
• allow packing/unpacking data to adapt source and destination data width with no extra DMA access.

Figure 4. FIFO structure
[Figure: four examples of the 4-word FIFO (byte lanes 0 to 3; levels empty, 1/4, 1/2, 3/4, full). Source byte to destination word: bytes B0 to B15 are packed into words W0 to W3. Source byte to destination half-word: bytes B0 to B15 are packed into half-words H0 to H7. Source half-word to destination word: half-words H0 to H7 are packed into words W0 to W3. Source half-word to destination byte: half-words H0 to H7 are unpacked into bytes B0 to B15.]
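
The packing cases of Figure 4 can be checked with a little arithmetic. The sketch below is a host-side illustration only (the function names are hypothetical): it converts a FIFO threshold into bytes and counts how many accesses of a given width are needed to move that many bytes through one port.

```c
#include <assert.h>

#define FIFO_BYTES 16u  /* 4 words of 32 bits per stream */

/* Bytes held in the FIFO when the threshold is reached.
 * quarters: 1 = 1/4, 2 = 1/2, 3 = 3/4, 4 = full. */
unsigned fifo_threshold_bytes(unsigned quarters)
{
    assert(quarters >= 1u && quarters <= 4u);
    return FIFO_BYTES * quarters / 4u;
}

/* Number of bus accesses of 'width_bytes' (1, 2 or 4) needed to move
 * 'total_bytes' through one DMA port. */
unsigned accesses_needed(unsigned total_bytes, unsigned width_bytes)
{
    assert(width_bytes == 1u || width_bytes == 2u || width_bytes == 4u);
    assert(total_bytes % width_bytes == 0u);
    return total_bytes / width_bytes;
}
```

With a byte-wide source, a word-wide destination and a full threshold, the model gives 16 source accesses for 4 destination accesses, which is the B0 to B15 into W0 to W3 case of Figure 4.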

1.1.10 Source and destination burst size

Burst transfers are guaranteed by the implemented DMA FIFOs.

Figure 5. DMA burst transfer
[Figure: on a peripheral request, the DMA stream moves a burst (here four words) between the peripheral and SRAM through the stream FIFO; the burst size gives the number of registers to be transferred in the burst.]

In response to a burst request from a peripheral, the DMA reads/writes the number of data units (a data unit can be a word, a half-word, or a byte) programmed by the burst size (4x, 8x or 16x data unit). The burst size on the DMA peripheral port must be set according to the peripheral needs/capabilities.

The DMA burst size on the memory port and the FIFO threshold configuration must match. This allows the DMA stream to have enough data in the FIFO when a burst transfer on the memory port is started. Table 6 shows the possible combinations of memory burst size, FIFO threshold configuration and data size.

To ensure data coherence, each group of transfers that form a burst is indivisible: AHB transfers are locked and the arbiter of the AHB bus matrix does not remove the DMA master's access rights during the burst transfer sequence.
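
The matching rule between memory burst size and FIFO threshold can be sketched as a small check. The sketch assumes the rule is "the threshold, in bytes, must be a whole multiple of the burst, in bytes"; this wording is not given in this note, but it reproduces every allowed and forbidden entry of Table 6. The function name is hypothetical.

```c
#include <assert.h>

/* Returns the number of memory-port bursts per FIFO threshold event,
 * or 0 if the combination is forbidden.
 *   msize_bytes : memory data size, 1 (byte), 2 (half-word) or 4 (word)
 *   burst_beats : 4, 8 or 16 (INCR4, INCR8, INCR16)
 *   quarters    : FIFO threshold, 1 = 1/4 ... 4 = full
 * Assumed rule: threshold bytes must be a multiple of burst bytes. */
unsigned bursts_per_threshold(unsigned msize_bytes, unsigned burst_beats,
                              unsigned quarters)
{
    assert(quarters >= 1u && quarters <= 4u);
    unsigned threshold_bytes = 4u * quarters;       /* 16-byte FIFO */
    unsigned burst_bytes = msize_bytes * burst_beats;
    if (threshold_bytes % burst_bytes != 0u)
        return 0u;                                  /* forbidden */
    return threshold_bytes / burst_bytes;
}
```

For instance, byte data with INCR4 and a 3/4 threshold gives 3 bursts of 4 bytes, while word data with INCR4 is only usable with a full FIFO.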

Table 6. Possible burst configurations

MSIZE     | FIFO level | MBURST = INCR4           | MBURST = INCR8          | MBURST = INCR16
----------|------------|--------------------------|-------------------------|--------------------
Byte      | 1/4        | 1 burst of 4 bytes       | Forbidden               | Forbidden
Byte      | 1/2        | 2 bursts of 4 bytes      | 1 burst of 8 bytes      | Forbidden
Byte      | 3/4        | 3 bursts of 4 bytes      | Forbidden               | Forbidden
Byte      | Full       | 4 bursts of 4 bytes      | 2 bursts of 8 bytes     | 1 burst of 16 bytes
Half-word | 1/4        | Forbidden                | Forbidden               | Forbidden
Half-word | 1/2        | 1 burst of 4 half-words  | Forbidden               | Forbidden
Half-word | 3/4        | Forbidden                | Forbidden               | Forbidden
Half-word | Full       | 2 bursts of 4 half-words | 1 burst of 8 half-words | Forbidden
Word      | 1/4        | Forbidden                | Forbidden               | Forbidden
Word      | 1/2        | Forbidden                | Forbidden               | Forbidden
Word      | 3/4        | Forbidden                | Forbidden               | Forbidden
Word      | Full       | 1 burst of 4 words       | Forbidden               | Forbidden

1.1.11 Double-buffer mode

A double-buffer stream works as a regular (single-buffer) stream, with the difference that it has two memory pointers. When the double-buffer mode is enabled, the circular mode is automatically enabled and, at each end of transaction (when the DMA_SxNDTR register reaches 0), the memory pointers are swapped.

This allows the software to process one memory area while the second memory area is being filled/used by the DMA transfer.

Figure 6. Double-buffer mode
[Figure: the peripheral data register (DMA_SxPAR) is connected alternately to memory location 0 (DMA_SxM0AR) and memory location 1 (DMA_SxM1AR); the CT bit selects the current target, and the HT and TC events mark the half and full transfer.]
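
The pointer swap at each end of transaction can be sketched as a purely software model. Nothing here touches hardware: the structure, its field names and the use of address 0 as a "nothing completed yet" marker are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of a double-buffer stream. */
typedef struct {
    uint32_t m0ar, m1ar;   /* the two memory pointers */
    uint16_t ndtr_reload;  /* programmed transfer size (must be >= 1) */
    uint16_t ndtr;         /* remaining data items */
    unsigned ct;           /* current target: 0 or 1 */
} dbm_stream_t;

/* Address of the buffer the DMA is currently filling/using. */
uint32_t dbm_current_buffer(const dbm_stream_t *s)
{
    return s->ct ? s->m1ar : s->m0ar;
}

/* Account for one transferred data item. At the end of a transaction
 * the counter is reloaded (circular mode) and the target is swapped.
 * Returns the address of the buffer that has just completed and is now
 * free for software, or 0 while the transaction is still in progress. */
uint32_t dbm_on_item(dbm_stream_t *s)
{
    assert(s->ndtr > 0u);
    s->ndtr--;
    if (s->ndtr != 0u)
        return 0u;
    uint32_t done = dbm_current_buffer(s);
    s->ndtr = s->ndtr_reload;
    s->ct ^= 1u;
    return done;
}
```

In this model, software works on the buffer returned by dbm_on_item() while the DMA continues into the other one, which is the ping-pong usage the section describes.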

In double-buffer mode, it is possible to update the base address for the AHB memory port on-the-fly (DMA_SxM0AR or DMA_SxM1AR) when the stream is enabled:
• When the CT (current target) bit in the DMA_SxCR register is equal to 0, the current DMA memory target is memory location 0, so the base address of memory location 1 (DMA_SxM1AR) can be updated.
• When the CT bit in the DMA_SxCR register is equal to 1, the current DMA memory target is memory location 1, so the base address of memory location 0 (DMA_SxM0AR) can be updated.

1.1.12 Flow control

The flow controller is the unit that controls the data transfer length and that is responsible for stopping the DMA transfer. The flow controller can be either the DMA or the peripheral.
• With the DMA as flow controller: in this case, it is necessary to define the transfer size value in the DMA_SxNDTR register before enabling the associated DMA stream. When a DMA request is served, the transfer size value decreases by the amount of transferred data (depending on the type of request: burst or single). When the transfer size value reaches 0, the DMA transfer is finished and the DMA stream is disabled.
• With the peripheral as flow controller: this is the case when the number of data items to be transferred is unknown. The peripheral indicates by hardware to the DMA controller when the last data are being transferred. Only the SD/MMC peripheral supports this mode.

1.2 Setting up a DMA transfer

To configure DMA stream x (where x is the stream number), the following procedure should be applied:

1. If the stream is enabled, disable it by resetting the EN bit in the DMA_SxCR register, then read this bit in order to confirm that there is no ongoing stream operation. Writing this bit to 0 is not immediately effective, since it is actually written to 0 only once all the current transfers have finished. When the EN bit is read as 0, the stream is ready to be configured. It is therefore necessary to wait for the EN bit to be cleared before starting any stream configuration. All the stream-dedicated bits set in the status registers (DMA_LISR and DMA_HISR) by the previous data block DMA transfer should be cleared before the stream can be re-enabled.
2. Set the peripheral port register address in the DMA_SxPAR register. The data will be moved from/to this address to/from the peripheral port after the peripheral event.
3. Set the memory address in the DMA_SxM0AR register (and in the DMA_SxM1AR register in the case of double-buffer mode). The data will be written to or read from this memory after the peripheral event.
4. Configure the total number of data items to be transferred in the DMA_SxNDTR register. After each peripheral event or each beat of the burst, this value is decremented.
5. Select the DMA channel (request) using CHSEL[2:0] in the DMA_SxCR register.
6. If the peripheral is intended to be the flow controller and if it supports this feature, set the PFCTRL bit in the DMA_SxCR register.
7. Configure the stream priority using the PL[1:0] bits in the DMA_SxCR register.
8. Configure the FIFO usage (enable or disable, threshold in transmission and reception).
9. Configure the data transfer direction, peripheral and memory incremented/fixed mode, single or burst transactions, peripheral and memory data widths, circular mode, double-buffer mode and interrupts after half and/or full transfer, and/or errors in the DMA_SxCR register.
10. Activate the stream by setting the EN bit in the DMA_SxCR register.

As soon as the stream is enabled, it can serve any DMA request from the peripheral connected to the stream.
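
The procedure above can be sketched against a plain C structure standing in for the stream registers. This is a minimal illustration under stated assumptions, not a driver: the structure layout, the bit positions and all names are hypothetical and must be checked against the reference manual; the example covers only a peripheral-to-memory transfer of half-words in direct mode with the DMA as flow controller (so step 6 is skipped).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block for one stream. The field order and the
 * bit positions below are assumptions, not taken from this note. */
typedef struct {
    volatile uint32_t CR, NDTR, PAR, M0AR, M1AR, FCR;
} dma_stream_regs_t;

#define CR_EN        (1u << 0)               /* stream enable          */
#define CR_DIR_P2M   (0u << 6)               /* peripheral to memory   */
#define CR_MINC      (1u << 10)              /* memory increment       */
#define CR_PSIZE_16  (1u << 11)              /* peripheral: half-word  */
#define CR_MSIZE_16  (1u << 13)              /* memory: half-word      */
#define CR_PL(x)     ((uint32_t)(x) << 16)   /* priority level PL[1:0] */
#define CR_CHSEL(x)  ((uint32_t)(x) << 25)   /* channel CHSEL[2:0]     */

typedef struct {
    uint32_t periph_addr, mem_addr;
    uint16_t items;             /* number of data items, not bytes */
    unsigned channel, priority;
} dma_p2m_cfg_t;

void dma_setup_p2m(dma_stream_regs_t *s, const dma_p2m_cfg_t *c)
{
    s->CR &= ~CR_EN;                 /* step 1: request disable ...      */
    while (s->CR & CR_EN) { }        /* ... and wait until EN reads 0    */
    /* Stale status flags of this stream must also be cleared here; the
     * flag-clear mechanism is outside the scope of this sketch. */
    s->PAR  = c->periph_addr;        /* step 2 */
    s->M0AR = c->mem_addr;           /* step 3 */
    s->NDTR = c->items;              /* step 4 */
    s->FCR  = 0u;                    /* step 8: direct mode (assumed)    */
    s->CR   = CR_CHSEL(c->channel)   /* step 5 */
            | CR_PL(c->priority)     /* step 7 */
            | CR_DIR_P2M | CR_MINC   /* step 9 */
            | CR_PSIZE_16 | CR_MSIZE_16;
    s->CR  |= CR_EN;                 /* step 10 */
}
```

On a real device the structure would be replaced by the device header's register definitions; on a host, a zero-initialised dma_stream_regs_t lets the sequence be exercised without hardware. With 64 half-word items, for example, the transfer moves 128 bytes in total.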

2 System performance considerations

STM32F2/F4 devices embed a multi-master/multi-slave architecture:
• Eight masters:
  - Cortex-Mx core I-bus
  - Cortex-Mx core D-bus
  - Cortex-Mx core S-bus
  - DMA1 memory bus
  - DMA2 memory bus
  - DMA2 peripheral bus
  - Ethernet DMA bus
  - USB high-speed DMA bus
• Eight slaves:
  - Internal Flash memory ICode bus
  - Internal Flash memory DCode bus
  - Main internal SRAM1 (112 KB, 64 KB on STM32F401x)
  - Auxiliary internal SRAM2 (16 KB) (not available on STM32F401)
  - Auxiliary internal SRAM3 (64 KB), available only on STM32F42x/F43x devices
  - AHB1 peripherals, including AHB-to-APB bridges and APB peripherals
  - AHB2 peripherals
  - AHB3 peripheral (FMC) (not available on STM32F401)

Masters and slaves are connected via a multi-layer bus matrix ensuring concurrent access and efficient operation, even when several high-speed peripherals work simultaneously. This architecture is shown in the next figure for the case of STM32F40x/F41x.

Figure 7. System architecture
[Figure: the Cortex-M4 core (with FPU and MPU) I-bus, D-bus and S-bus, the two dual-port DMAs (memory and peripheral ports, FIFO/streams), the Ethernet DMA and the USB HS DMA are masters on the multi-AHB bus matrix; the slaves are the Flash memory (ICode and DCode, through the ART accelerator), the SRAMs, the FSMC, AHB peripherals and the dual-port AHB-to-APB bridges. The CCM data RAM is connected to the core D-bus.]

2.1 Multi-layer bus matrix

The multi-layer bus matrix allows masters to perform data transfers concurrently as long as they are addressing different slave modules. On top of the Cortex-Mx Harvard architecture and dual AHB port DMAs, this structure enhances data transfer parallelism, thus helping to reduce the execution time and to optimize the DMA efficiency and power consumption.

2.1.1 Definitions

• AHB master: a bus master is able to initiate read and write operations. Only one master can own the bus at any given time.
• AHB slave: a bus slave responds to master read or write operations. The bus slave signals success, failure or wait states back to the master.
• AHB arbiter: a bus arbiter ensures that only one master can initiate a read or write operation at one time.
• AHB bus matrix: a multi-layer AHB bus matrix that interconnects AHB masters to AHB slaves, with a dedicated AHB arbiter for each layer. The arbitration uses a round-robin algorithm.

2.1.2 Round-robin priority scheme

A round-robin priority scheme is implemented at bus matrix level in order to ensure that each master can access any slave with very low latency:
• The round-robin arbitration policy allows a fair distribution of bus bandwidth.
• The maximum latency is bounded.
• The round-robin quantum is 1x transfer.

Bus matrix arbiters intervene to solve access conflicts when several AHB masters try to access the same AHB slave simultaneously.

In the following example (Figure 8), both the CPU and DMA1 try to access SRAM1 to read data.

Figure 8. CPU and DMA1 request an access to SRAM1
[Figure: the Cortex-M4 core (master 1) and the dual-port DMA1 memory port (master 2) both request SRAM1 through the bus matrix.]

In case of bus access concurrency, as in the above example, bus matrix arbitration is required. The round-robin policy is then applied in order to solve the issue: if the last master that won the bus was the CPU, then during the next access DMA1 wins the bus and accesses SRAM1 first. The CPU then has the right to access SRAM1.

This shows that the transfer latency associated with one master depends on the number of other pending master requests to access the same AHB slave. In the following example (Figure 9), five masters try to access SRAM1 simultaneously.

Figure 9. Five masters request SRAM access
[Figure: timeline of the masters accessing SRAM1 in turn, with a quantum of one AHB cycle each, showing the resulting DMA latency.]

The latency for DMA1 to win the bus matrix again and access SRAM1 (for example) is equal to the execution time of all pending requests coming from the other masters.

2.1.3 BusMatrix arbitration and DMA transfer delays worst case

The latency seen by the DMA master port on one transaction depends on the other masters' transfer types and lengths.

For instance, consider the previous DMA1 and CPU example (Figure 8) with concurrent access to SRAM: the latency of the DMA transfer varies depending on the CPU transaction length.

If bus access is first granted to the CPU and the CPU is not performing a single data load/store, the DMA wait time to gain access to SRAM can expand from one AHB cycle (for a single data load/store) to N AHB cycles, where N is the number of data words in the CPU transaction.

The CPU locks the AHB bus to keep ownership and reduce latency during multiple load/store operations and interrupt entry. This enhances firmware responsiveness, but it can result in delays on the DMA transaction.

The delay on a DMA1 SRAM access in concurrency with the CPU depends on the type of transfer:
• CPU transfer issued by interrupt (context save): 8 AHB cycles
• CPU transfer issued by LDM/STM instructions: 14 AHB cycles (a)
  - Transfers of up to 14 registers from/to memory

a. Latency due to transfers issued by LDM/STM instructions can be reduced by configuring the compiler to split load/store multiple instructions into single load/store instructions.
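
The round-robin behaviour and the resulting DMA wait time can be sketched with two small host-side helpers. This is an illustration only: the function names are hypothetical, the order in which masters are visited (by index) is an assumption, and the wait-time helper simply adds up the transaction lengths of the other pending masters as described above.

```c
#include <assert.h>

/* Round-robin grant: starting after the last winner, return the index
 * of the next requesting master, or -1 if nobody is requesting.
 * request_mask has bit i set when master i is requesting the slave. */
int rr_next_master(unsigned last, unsigned request_mask, unsigned n_masters)
{
    for (unsigned i = 1u; i <= n_masters; ++i) {
        unsigned candidate = (last + i) % n_masters;
        if (request_mask & (1u << candidate))
            return (int)candidate;
    }
    return -1;
}

/* AHB cycles the DMA waits before it wins the slave again, assuming
 * every other pending master is served once and holds the bus for its
 * own transaction length (given in AHB cycles). */
unsigned dma_wait_cycles(const unsigned other_masters_cycles[], unsigned n)
{
    unsigned total = 0u;
    for (unsigned i = 0u; i < n; ++i)
        total += other_masters_cycles[i];
    return total;
}
```

With the CPU as master 0 and DMA1 as master 1, both requesting, the model grants DMA1 when the CPU was the last winner, as in the Figure 8 example. Four other masters doing single accesses give a wait of 4 cycles, whereas one CPU context save alone already gives 8.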

Figure 10. DMA transfer delay due to CPU transfer issued by interrupt
[Figure: timing diagram showing a DMA memory port request, the arbitration granting the bus to the CPU, and the delay before the DMA request is served.]

The above figure details the case of a DMA transfer delayed by a CPU multi-cycle transfer due to an interrupt entry. The DMA memory port is triggered to perform a memory access. After arbitration, the AHB bus is not granted to the DMA1 memory port but to the CPU system bus. An additional delay is observed before the DMA request is served. It is 8 AHB cycles for a CPU transfer issued by interrupt.

The same behavior can be observed with other masters (such as DMA2, USB_HS, Ethernet, etc.) when they simultaneously address the same slave with a transaction length different from one data unit.

In order to improve DMA access performance over the BusMatrix, it is recommended to avoid bus contention.

2.2 DMA transfer paths

2.2.1 Dual DMA port

STM32F2/F4 devices embed two DMAs. Each DMA has two ports, a memory port and a peripheral port, which can operate simultaneously not only at DMA level but also with other system masters, using the external bus matrix and dedicated DMA paths.

The simultaneous operation makes it possible to optimize DMA efficiency and to reduce the response time (wait time between request and data transfer).
figure 11. dma dual port
[figure 11: block diagram showing the cortex-m4 core, the dual-port dma1 and dma2 (mem and periph ports, fifo/streams), the multi-ahb bus matrix, sram1, sram2, fsmc, flash, ccm data ram and the dual-port ahb-apb bridges]

for dma2:
• the mem (memory port) can access ahb1, ahb2, sram1, sram2, fsmc and flash memory d-code through the bus matrix.
• the periph (peripheral port) can access:
  - ahb1, ahb2, sram1, sram2, fsmc and flash memory d-code through the bus matrix,
  - the ahb-to-apb2 bridge through a direct path (not crossing the bus matrix).

for dma1:
• the mem (memory port) can access sram1, sram2, fsmc and flash memory d-code through the bus matrix.
• the periph (peripheral port) can only access the ahb-to-apb1 bridge through a direct path (not crossing the bus matrix).
2.2.2 dma transfer states

this section explains the dma transfer steps at the peripheral port level and at the memory port level:

• for a peripheral-to-memory transfer:
  in this transfer mode, the dma requires two bus accesses to perform the transfer:
  - one access over the peripheral port, triggered by the peripheral's request,
  - one access over the memory port, which can be triggered either by the fifo threshold (when fifo mode is used) or immediately after the peripheral read (when direct mode is used).

figure 12. peripheral-to-memory transfer states
[figure 12: peripheral side: new request, request arbitration, transfer data, store data in fifo; memory side: new fifo request, request arbitration, transfer data, end of transfer]

• for a memory-to-peripheral transfer:
  in this transfer mode, the dma requires two bus accesses to perform the transfer:
  - the dma anticipates the peripheral's access: it reads data from the memory and stores it in the fifo to ensure an immediate data transfer as soon as a dma peripheral request is triggered.
  - when a peripheral request is triggered, a transfer is generated on the dma peripheral port.
figure 13. memory-to-peripheral transfer states
[figure 13: states shown for the peripheral side and the memory side: new request, request arbitration, transfer data, store data in fifo, new fifo request, request arbitration, transfer data, end of transfer]

2.2.3 dma request arbitration

as described in section 1.1.2, the stm32f2/f4 dma embeds an arbiter that manages the eight dma stream requests based on their priorities for each of the two ahb master ports (memory and peripheral ports) and launches the peripheral/memory access sequences.

when more than one dma request is active, the dma needs to arbitrate internally between the active requests and decide which request is to be served first.

the following figure shows two circular dma requests triggered at the same time by dma stream "request 1" and by dma stream "request 2" (requests 1 and 2 could be any dma peripheral request). at the next ahb clock cycle, the dma arbiter checks the active pending requests and grants access to the "request 1" stream, which has the highest priority. the next arbitration cycle occurs during the last data cycle of the "request 1" stream. at that time, "request 1" is masked and the arbiter sees only "request 2" as active, so access is reserved to "request 2" this time, and so on.

figure 14. dma request arbitration
[figure 14: request status over time: both requests triggered, dma arbitration, dma servicing one request while it is masked and the other stays active, then the roles alternate]
general recommendations:
• the high-speed/high-bandwidth peripherals must have the highest dma priorities. this ensures that the maximum data latency is respected for these peripherals and that overrun/underrun conditions are avoided.
• in case of equal bandwidth requirements, it is recommended to assign a higher priority to the peripherals working in slave mode (which have no control over the data transfer speed) than to the ones working in master mode (which may control the data flow).
• as the two dmas can work in parallel thanks to the bus matrix multi-layer structure, high-speed peripherals' requests can be balanced between the two dmas when possible.

2.3 ahb-to-apb bridge

stm32f2/f4 devices embed two ahb-to-apb bridges, apb1 and apb2, to which the peripherals are connected.

2.3.1 dual ahb-to-apb port

the stm32f2/f4 ahb-to-apb bridge is a dual-port architecture that allows access through two different paths:
• a direct path (not crossing the bus matrix) that can be generated from dma1 to apb1 or from dma2 to apb2; in this case, access is not penalized by the bus matrix arbiter.
• a common path (through the bus matrix) that can be generated either from the cpu or from dma2, which needs the bus matrix arbitration to win the bus.

2.3.2 ahb-to-apb bridge arbitration

because of the dma's direct paths on these products, an arbiter is implemented at the ahb-to-apb bridge level to resolve concurrent access requests.

the following figure illustrates a concurrent access request at an ahb-apb1 bridge generated by the cpu (accessed through the bus matrix) and dma1 (accessed through the direct path).
figure 15. ahb-to-apb1 bridge concurrent cpu and dma1 access request
[figure 15: block diagram showing the cortex-m4 core and the dual-port dma1 (mem and periph ports, fifo/streams) as masters, the multi-ahb bus matrix, sram and the dual-port ahb-apb1 bridge]

to grant bus access, the ahb-apb bridge applies the round-robin policy:
• the round-robin quantum is 1x apb transfer.
• the maximum latency on the dma peripheral port is bounded (1 apb transfer).

only the cpu and the dmas can generate a concurrent access to the apb1 and apb2 buses:
• for apb1, a concurrent access can be generated if the cpu, dma1 and/or dma2 request simultaneous access.
• for apb2, a concurrent access can be generated if the cpu and dma2 request simultaneous access.
3 how to predict dma latencies

when designing a firmware application based on a microcontroller, the user must ensure that no underrun/overrun can occur. knowing the exact dma latency for each transfer is therefore mandatory to check whether the internal system can sustain the total data bandwidth required by the application.

3.1 dma transfer time

3.1.1 default dma transfer timing

as described in section 2.2.2, to perform a dma transfer from peripheral to memory, two bus accesses are required:
• one access over the peripheral port, triggered by the peripheral request, which needs:
  - dma peripheral port request arbitration
  - peripheral address computation
  - reading data from the peripheral to the dma fifo (dma source)
• one access over the memory port, which can be triggered by the fifo threshold (when fifo mode is used) or immediately after the peripheral read (when direct mode is used), and which needs:
  - dma memory port request arbitration
  - memory address computation
  - writing loaded data in sram (dma destination)

when transferring data from memory to peripheral, two accesses are also required, as described in section 2.2.2:
• first access: the dma anticipates the peripheral access, reads data from memory and stores it in the fifo to ensure an immediate data transfer as soon as the dma peripheral request is triggered. this operation needs:
  - dma memory port request arbitration
  - memory address computation
  - reading data from memory to the dma fifo (dma source)
• second access: when the peripheral request is triggered, a transfer is generated on the dma peripheral port. this operation needs:
  - dma peripheral port request arbitration
  - peripheral address computation
  - writing loaded data at the peripheral address (dma destination)

as a general rule, the total transfer time by dma stream, t_s, is equal to:

t_s = t_sp (peripheral access/transfer time) + t_sm (memory access/transfer time)

with:
• t_sp is the total time for dma peripheral port access and transfer, which is equal to:

  t_sp = t_pa + t_pac + t_bma + t_edt + t_bs

  where the terms are defined in table 7.

table 7. peripheral port access/transfer time versus dma path used

description | through bus matrix, to ahb peripherals | through bus matrix, to apb peripherals | dma's direct paths
t_pa: dma peripheral port arbitration | 1 ahb cycle | 1 ahb cycle | 1 ahb cycle
t_pac: peripheral address computation | 1 ahb cycle | 1 ahb cycle | 1 ahb cycle
t_bma: bus matrix arbitration (when no concurrency) (1) | 1 ahb cycle | 1 ahb cycle | n/a
t_edt: effective data transfer | 1 ahb cycle (2) (3) | 2 apb cycles | 2 apb cycles
t_bs: bus synchronization | n/a | 1 ahb cycle | 1 ahb cycle

1. in the case of stm32f401, t_bma is equal to zero.
2. for fmc, an additional cycle can be added depending on the external memory used. additional ahb cycles are added depending on external memory timings.
3. in case of burst, the effective data transfer time depends on the burst length (inc4: t_edt = 4 ahb cycles).

• t_sm is the total time for dma memory port access and transfer, which is equal to:

  t_sm = t_ma + t_mac + t_bma + t_sram

  where the terms are defined in table 8.

table 8. memory port access/transfer time

description | latency
t_ma: dma memory port arbitration | 1 ahb cycle
t_mac: memory address computation | 1 ahb cycle
t_bma: bus matrix arbitration (when no concurrency) (1) | 1 ahb cycle (2)
t_sram: sram read or write access | 1 ahb cycle

1. in the case of stm32f401, t_bma is equal to zero.
2. for consecutive sram accesses (while no other master accesses the same sram in-between), t_bma = 0 cycle.

3.1.2 dma transfer time versus concurrent access

additional latency can be added to the dma service timing described in section 3.1.1 when several masters try to access the same slave simultaneously.

for the peripheral and memory worst-case access/transfer time, the following factors impact the total delay time for dma stream service:
• when several masters are accessing the same ahb destination simultaneously, the dma latency is impacted; the dma transfer cannot start until the bus matrix arbiter grants access to the dma, as described in section 2.1.2.
• when several masters (dma and cpu) are accessing the same ahb-to-apb bridge, the dma transfer time is delayed due to the ahb-to-apb bridge arbitration, as described in section 2.3.2.
3.2 examples

3.2.1 adc-to-sram dma transfer

this example is applicable to products stm32f2xx, stm32f405, stm32f407, stm32f415, stm32f417, stm32f42x and stm32f43x.

the adc is configured in continuous triple interleaved mode. in this mode, it continuously converts one analog input channel at the maximum adc speed (36 mhz). the adc prescaler is set to 2, the sampling time is set to 1.5 cycles, and the delay between two consecutive adc samples of the interleaved mode is set to 5 cycles.

the dma2 stream0 transfers the adc converted value to an sram buffer. dma2 access to the adc is done through the direct path; however, dma access to sram is done through the bus matrix.

in this example, the total dma latency from the adc dma trigger (adc eoc) to the write of the adc value in sram is equal to 9 ahb cycles for an ahb/apb prescaler of 1 and to 11 ahb cycles for an ahb/apb prescaler of 2.

note: when using the fifo, the dma memory port access is launched when the fifo level configured by the user is reached.

table 9. dma peripheral (adc) port transfer latency

transfer time | f_ahb = 72 mhz, f_apb2 = 72 mhz (ahb/apb ratio = 1) | f_ahb = 144 mhz, f_apb2 = 72 mhz (ahb/apb ratio = 2)
t_pa: dma peripheral port arbitration | 1 ahb cycle | 1 ahb cycle
t_pac: peripheral address computation | 1 ahb cycle | 1 ahb cycle
t_bma: bus matrix arbitration | n/a (1) | n/a (1)
t_edt: effective data transfer | 2 ahb cycles | 4 ahb cycles
t_bs: bus synchronization | 1 ahb cycle | 1 ahb cycle
t_sp: total dma transfer time for peripheral port | 5 ahb cycles | 7 ahb cycles

1. dma2 accesses the adc through the direct path: no bus matrix arbitration.

table 10. dma memory (sram) port transfer latency

transfer time | f_ahb = 72 mhz, f_apb2 = 72 mhz (ahb/apb ratio = 1) | f_ahb = 144 mhz, f_apb2 = 72 mhz (ahb/apb ratio = 2)
t_ma: dma memory port arbitration | 1 ahb cycle | 1 ahb cycle
t_mac: memory address computation | 1 ahb cycle | 1 ahb cycle
t_bma: bus matrix arbitration | 1 ahb cycle (1) | 1 ahb cycle (1)
t_sram: sram write access | 1 ahb cycle | 1 ahb cycle
t_sm: total dma transfer time for memory port | 4 ahb cycles | 4 ahb cycles

1. in case of multiple dma accesses to sram, the bus matrix arbitration is equal to 0 cycle if no other master accessed the sram in-between.
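the arithmetic behind this adc example can be reproduced with a small model of the per-phase cycle counts of section 3.1.1. this is only a sketch: the type and function names are hypothetical, and it assumes a single (non-burst) transfer, no concurrent master, no external memory wait states and a device other than stm32f401 (where t_bma would be zero).

```c
#include <assert.h>

/* Path used by the DMA peripheral port (hypothetical names). */
typedef enum {
    PATH_MATRIX_TO_AHB,   /* through the bus matrix, AHB peripheral */
    PATH_MATRIX_TO_APB,   /* through the bus matrix, APB peripheral */
    PATH_DIRECT_TO_APB    /* DMA direct path to the AHB-to-APB bridge */
} periph_path_t;

/* t_sp in AHB cycles; ahb_apb_ratio is f_ahb / f_apb (1, 2, ...). */
unsigned dma_t_sp(periph_path_t path, unsigned ahb_apb_ratio)
{
    assert(ahb_apb_ratio >= 1u);
    unsigned t_pa  = 1u;                                  /* port arbitration */
    unsigned t_pac = 1u;                                  /* address computation */
    unsigned t_bma = (path == PATH_DIRECT_TO_APB) ? 0u : 1u;
    /* 1 AHB cycle on AHB, 2 APB cycles (scaled to AHB cycles) on APB */
    unsigned t_edt = (path == PATH_MATRIX_TO_AHB) ? 1u : 2u * ahb_apb_ratio;
    unsigned t_bs  = (path == PATH_MATRIX_TO_AHB) ? 0u : 1u;
    return t_pa + t_pac + t_bma + t_edt + t_bs;
}

/* t_sm in AHB cycles; consecutive_sram_access != 0 drops t_bma to 0. */
unsigned dma_t_sm(int consecutive_sram_access)
{
    unsigned t_ma = 1u, t_mac = 1u, t_sram = 1u;
    unsigned t_bma = consecutive_sram_access ? 0u : 1u;
    return t_ma + t_mac + t_bma + t_sram;
}

/* Total stream service time: t_s = t_sp + t_sm. */
unsigned dma_t_s(periph_path_t path, unsigned ahb_apb_ratio,
                 int consecutive_sram_access)
{
    return dma_t_sp(path, ahb_apb_ratio) + dma_t_sm(consecutive_sram_access);
}
```

with the direct path and a ratio of 1, dma_t_s() returns 5 + 4 = 9 ahb cycles; with a ratio of 2 it returns 7 + 4 = 11 ahb cycles, which matches the totals given for this example.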
3.2.2 spi full duplex dma transfer

this example is applicable to products stm32f2xx, stm32f405, stm32f407, stm32f415, stm32f417, stm32f42x and stm32f43x, and is based on the spi1 peripheral.

two dma requests are configured:
• dma2_stream2 for spi1_rx: this stream is configured with the highest priority in order to serve the spi1 received data in time and transfer it from the spi1_dr register to the sram buffer.
• dma2_stream3 for spi1_tx: this stream transfers data from the sram buffer to the spi1_dr register.

the ahb frequency is equal to the apb2 frequency (84 mhz) and spi1 is configured to operate at the maximum speed (42 mhz). dma2_stream2 (spi1_rx) is triggered before dma2_stream3 (spi1_tx), which is triggered two ahb cycles later.

in this configuration, the cpu polls the i2c1_dr register in an endless loop. knowing that the i2c1 peripheral is mapped on apb1 and that the spi1 peripheral is mapped on apb2, the system paths are the following:
• direct path for dma2 to access apb2 (not through the bus matrix),
• the cpu accesses apb1 through the bus matrix.

the aim is to demonstrate that the dma timings are not impacted by the cpu polling on apb1. the following figure summarizes the dma timing for transmit and receive modes, as well as the time scheduling for each operation:

figure 16. spi full duplex dma transfer time
[figure 16: ahb_clk, the two dma stream requests and the two dma stream transactions, each broken down into its t_pa, t_pac, t_edt, t_bs, t_ma, t_mac and t_sram phases]
this figure illustrates the following conclusions:
• cpu polling on apb1 does not impact the dma transfer latency on apb2.
• for the dma2_stream2 (spi1_rx) transaction, at the eighth ahb clock cycle, there is no bus matrix arbitration since it is assumed that the last master that accessed the sram is dma2 (so no re-arbitration is needed).
• for the dma2_stream3 (spi1_tx) transaction, this stream anticipates the read from sram and writes the data to the fifo; then, once triggered, the dma peripheral port (whose destination is spi1) starts operation.
• for dma2_stream3, the dma peripheral arbitration phase (1 ahb cycle) is executed during the dma2_stream2 bus synchronization cycle.

this optimization always takes place when a dma request is triggered before the end of the current dma request transaction.
4 tips and warnings while programming the dma controller

1. software sequence to disable dma

to switch off a peripheral connected to a dma stream request, it is mandatory to:
• switch off the dma stream to which the peripheral is connected,
• wait until the en bit in the dma_sxcr register is reset ("0").

only then can the peripheral be safely disabled.

note: in both cases, a transfer complete interrupt flag (tcif in dma_lisr or dma_hisr) is set to indicate the end of transfer due to the stream disable.

2. dma flag management before enabling a new transfer

before enabling a new transfer, the user must ensure that the transfer complete interrupt flag (tcif) in dma_lisr or dma_hisr is cleared.

as a general recommendation, it is advised to clear all flags in the dma_lifcr and dma_hifcr registers before starting a new transfer.

3. software sequence to enable dma

the following software sequence applies when enabling dma:
• configure the suitable dma stream.
• enable the dma stream used (set the en bit in the dma_sxcr register).
• enable the peripheral used.

note: if the user enables the used peripheral before the corresponding dma stream, a "feif" (fifo error interrupt flag) may be set because the dma is not ready to provide the first required data to the peripheral (in case of memory-to-peripheral transfer).

4. memory-to-memory transfer while ndtr=0

when a dma stream is configured to perform a memory-to-memory transfer in normal mode, the transfer complete flag is set once ndtr reaches 0. at that time, if the user sets the enable bit (en bit in dma_sxcr) of this stream, the memory-to-memory transfer is automatically re-triggered with the last ndtr value.

5. dma peripheral burst with pinc/minc=0

the dma burst feature with peripheral address increment (pinc) or memory address increment (minc) disabled makes it possible to address internal or external (fsmc) peripherals supporting burst (embedding fifos). this mode ensures that this dma stream cannot be interrupted by other dma streams during its transactions.
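the disable, flag-clear and enable sequences of items 1 to 3 above can be sketched as follows. this is a minimal illustration, not st code: the function names and the peripheral callbacks are hypothetical, and the registers are passed as plain pointers so that the same code can be exercised on a host with ordinary variables. the en bit position (bit 0 of dma_sxcr) and the clear mask 0x0F7D0F7D (all defined clear bits of dma_lifcr or dma_hifcr) are the only hardware facts assumed. note that writing the full mask clears the flags of every stream of that controller, which may not be desirable when other streams are in use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_SXCR_EN        (1u << 0)     /* EN bit of DMA_SxCR */
#define DMA_IFCR_CLEAR_ALL 0x0F7D0F7Du   /* all clear bits of DMA_LIFCR/DMA_HIFCR */

/* Item 1: switch the stream off, wait until EN reads back 0, and only
 * then disable the peripheral (callback supplied by the caller). */
void dma_stream_stop(volatile uint32_t *sxcr, void (*disable_peripheral)(void))
{
    *sxcr &= ~DMA_SXCR_EN;
    while ((*sxcr & DMA_SXCR_EN) != 0u) {
        /* on hardware, EN may stay set until the ongoing transfer ends */
    }
    if (disable_peripheral != NULL) {
        disable_peripheral();
    }
}

/* Items 2 and 3: clear the flags, enable the stream, then the peripheral.
 * The stream is assumed to be fully configured already. */
void dma_stream_start(volatile uint32_t *sxcr,
                      volatile uint32_t *lifcr, volatile uint32_t *hifcr,
                      void (*enable_peripheral)(void))
{
    *lifcr = DMA_IFCR_CLEAR_ALL;
    *hifcr = DMA_IFCR_CLEAR_ALL;
    *sxcr |= DMA_SXCR_EN;
    if (enable_peripheral != NULL) {
        enable_peripheral();
    }
}

/* Host-side usage on plain variables standing in for the registers.
 * Returns 1 when the sequence left the mock registers as expected. */
int dma_sequence_demo(void)
{
    uint32_t cr = DMA_SXCR_EN | (3u << 16);   /* enabled, PL = very high */
    uint32_t lifcr = 0u, hifcr = 0u;

    dma_stream_stop(&cr, NULL);
    if (cr != (3u << 16)) {
        return 0;                             /* only EN must be cleared */
    }
    dma_stream_start(&cr, &lifcr, &hifcr, NULL);
    return (cr & DMA_SXCR_EN) != 0u
        && lifcr == DMA_IFCR_CLEAR_ALL
        && hifcr == DMA_IFCR_CLEAR_ALL;
}
```

on a real device the three pointers would be the addresses of the stream's dma_sxcr register and of the controller's dma_lifcr and dma_hifcr registers, and the callbacks would set or clear the enable bit of the peripheral concerned.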
6. twice-mapped dma requests

when the user configures two (or more) dma streams to serve the same peripheral request, software should ensure that the current dma stream is completely disabled (by polling the en bit in the dma_sxcr register) before enabling a new dma stream.

7. best dma throughput configuration

when using stm32f4xx with a reduced ahb frequency while the dma is servicing a high-speed peripheral, it is recommended to put the stack and heap in the ccm (which can be addressed directly by the cpu through the d-bus) instead of putting them in the sram, which would create additional concurrency between the cpu and the dma accessing the sram memory.

8. dma transfer suspension

at any time, a dma transfer can be suspended, either to be restarted later on or to be definitively disabled before the end of the dma transfer. there are two cases:
• the stream disables the transfer with no later restart from the point where it was stopped: there is no particular action to take, except to clear the en bit in the dma_sxcr register to disable the stream and to wait until the en bit is reset. as a consequence:
  - the dma_sxndtr register contains the number of remaining data items at the moment when the stream was stopped, so that the software can determine how many data items were transferred before the stream was interrupted.
• the stream suspends the transfer in order to resume it later by re-enabling the stream: to restart from the point where the transfer was stopped, the software has to read the dma_sxndtr register after disabling the stream (en bit at "0") to know the number of data items already collected. then:
  - the peripheral and/or memory addresses have to be updated in order to adjust the address pointers.
  - the sxndtr register has to be updated with the remaining number of data items to be transferred (the value read when the stream was disabled).
  - the stream may then be re-enabled to restart the transfer from the point where it was stopped.

note: in both cases, a transfer complete interrupt flag (tcif in dma_lisr or dma_hisr) is set to indicate the end of transfer due to the stream interruption.
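the suspend-and-resume case of item 8 can be sketched as below. this is an illustration under stated assumptions rather than a reference implementation: dma_stream_regs is a hypothetical stand-in for the first four stream registers, the helper names are invented, only the memory address is incremented (minc = 1, pinc = 0), both sides use the same data width, and the caller is expected to clear the stream's flags (see item 2) before resuming.

```c
#include <assert.h>
#include <stdint.h>

#define DMA_SXCR_EN (1u << 0)   /* EN bit of DMA_SxCR */

/* Hypothetical stand-in for the first registers of one DMA stream. */
typedef struct {
    volatile uint32_t CR;     /* DMA_SxCR   */
    volatile uint32_t NDTR;   /* DMA_SxNDTR */
    volatile uint32_t PAR;    /* DMA_SxPAR  */
    volatile uint32_t M0AR;   /* DMA_SxM0AR */
} dma_stream_regs;

/* Disable the stream, wait for EN = 0, return the remaining item count. */
uint32_t dma_stream_suspend(dma_stream_regs *s)
{
    s->CR &= ~DMA_SXCR_EN;
    while ((s->CR & DMA_SXCR_EN) != 0u) {
        /* wait until the stream is effectively disabled */
    }
    return s->NDTR;
}

/* Adjust the memory pointer, reload NDTR with the remaining count and
 * re-enable the stream. item_bytes is the data width in bytes. */
void dma_stream_resume(dma_stream_regs *s, uint32_t mem_base,
                       uint32_t total_items, uint32_t remaining,
                       uint32_t item_bytes)
{
    assert(remaining <= total_items);
    s->M0AR = mem_base + (total_items - remaining) * item_bytes;
    s->NDTR = remaining;
    s->CR |= DMA_SXCR_EN;
}

/* Host-side usage on a mock stream: 100 half-word items, 40 left. */
int dma_suspend_resume_demo(void)
{
    dma_stream_regs s = { DMA_SXCR_EN, 40u, 0x40013000u, 0x20000000u };

    uint32_t remaining = dma_stream_suspend(&s);
    if (remaining != 40u || (s.CR & DMA_SXCR_EN) != 0u) {
        return 0;
    }
    dma_stream_resume(&s, 0x20000000u, 100u, remaining, 2u);
    /* 60 items of 2 bytes already done: pointer moves by 120 = 0x78 */
    return s.M0AR == 0x20000078u && s.NDTR == 40u
        && (s.CR & DMA_SXCR_EN) != 0u;
}
```

the addresses used in the demo are arbitrary placeholders. if the peripheral address is also incremented, dma_sxpar would need the same kind of adjustment.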
5 conclusion

the stm32f2/f4 dma controller is designed to cover most embedded use cases by:
• giving firmware the flexibility to choose a suitable combination among 16 streams x 16 channels (eight for each dma),
• reducing the total latency of a dma transfer, thanks to the dual ahb port architecture and the direct paths to the apb bridges, which avoid cpu stalls on ahb1 access when the dma is servicing low-speed apb peripherals,
• implementing fifos on the dma, which gives firmware more flexibility to configure different data sizes between source and destination, and speeds up transfers when using incremental burst transfer mode.
6 revision history

table 11. document revision history

date | revision | changes
04-feb-2014 | 1 | initial release.
please read carefully:

information in this document is provided solely in connection with st products. stmicroelectronics nv and its subsidiaries ("st") reserve the right to make changes, corrections, modifications or improvements, to this document, and the products and services described herein at any time, without notice.

all st products are sold pursuant to st's terms and conditions of sale.

purchasers are solely responsible for the choice, selection and use of the st products and services described herein, and st assumes no liability whatsoever relating to the choice, selection or use of the st products and services described herein.

no license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted under this document. if any part of this document refers to any third party products or services it shall not be deemed a license grant by st for the use of such third party products or services, or any intellectual property contained therein or considered as a warranty covering the use in any manner whatsoever of such third party products or services or any intellectual property contained therein.

unless otherwise set forth in st's terms and conditions of sale st disclaims any express or implied warranty with respect to the use and/or sale of st products including without limitation implied warranties of merchantability, fitness for a particular purpose (and their equivalents under the laws of any jurisdiction), or infringement of any patent, copyright or other intellectual property right.

st products are not designed or authorized for use in: (a) safety critical applications such as life supporting, active implanted devices or systems with product functional safety requirements; (b) aeronautic applications; (c) automotive applications or environments, and/or (d) aerospace applications or environments. where st products are not designed for such use, the purchaser shall use products at purchaser's sole risk, even if st has been informed in writing of such usage, unless a product is expressly designated by st as being intended for "automotive, automotive safety or medical" industry domains according to st product design specifications. products formally escc, qml or jan qualified are deemed suitable for use in aerospace by the corresponding governmental agency.

resale of st products with provisions different from the statements and/or technical features set forth in this document shall immediately void any warranty granted by st for the st product or service described herein and shall not create or extend in any manner whatsoever, any liability of st.

st and the st logo are trademarks or registered trademarks of st in various countries. information in this document supersedes and replaces all information previously supplied. the st logo is a registered trademark of stmicroelectronics. all other names are the property of their respective owners.

© 2014 stmicroelectronics - all rights reserved

stmicroelectronics group of companies
australia - belgium - brazil - canada - china - czech republic - finland - france - germany - hong kong - india - israel - italy - japan - malaysia - malta - morocco - philippines - singapore - spain - sweden - switzerland - united kingdom - united states of america

www.st.com
